ABSTRACT



The shape of the curves defined by the counts of radio sources per unit area as 
a function of their flux density was one of the earliest cosmological probes. Radio 
source counts continue to be an area of astrophysical interest as they can be used to 
study the relative populations of galaxy types in the Universe (as well as investigate 
any cosmological evolution in their respective luminosity functions). They are also a 
vital consideration for determining how source confusion may limit the depth of a 
radio interferometer observation, and are essential for characterising the extragalac- 
tic foregrounds in Cosmic Microwave Background experiments. There is currently no 
consensus as to the relative populations of the faintest (sub-mJy) source types, where 
the counts show a turn-up. Most of the source count data in this regime are gathered 
from multiple observations that each use a deep, single pointing with an interfero- 
metric radio telescope. These independent count measurements exhibit large amounts 
of scatter (factors of order a few) that significantly exceeds their respective stated 
uncertainties. In this article we use a simulation of the extragalactic radio continuum 
emission to assess the level at which sample variance may be the cause of the scatter. 
We find that the scatter induced by sample variance in the simulated counts decreases 
towards lower flux density bins as the raw source counts increase. The field-to-field 
variations make significant contributions to the scatter in the measurements of counts 
derived from deep observations that consist of a single pointing, and could even be 
the sole cause at >100 μJy. We present a method for evaluating the flux density limit 
that a radio survey must reach in order to reduce the count uncertainty induced by 
sample variance to a specific value. We also derive a method for correcting Poisson 
errors on source counts from existing and future deep radio surveys in order to include 
the uncertainties due to the cosmological clustering of sources. A conclusive empir- 
ical constraint on the effect of sample variance at these low luminosities is unlikely 
to arise until the completion of future large-scale radio surveys with next-generation 
radio telescopes. 

Key words: galaxies: general - galaxies: statistics - radio continuum: galaxies 



1 INTRODUCTION 

Astrophysical radio emission, at least that which we observe
away from the plane of the Milky Way, tends to originate
from extragalactic objects at great distances. The differential
counts¹ of these distant radio sources formed one of the
earliest cosmological probes (e.g. Longair, 1966). In a
non-expanding Euclidean universe populated with non-evolving
sources we would see the integrated source counts $N(S)$
scaling with source flux density $S$ according to the relationship
$N(S) \propto S^{-3/2}$.² Observed departures from this relationship
thus inform on the geometry of the Universe, and radio
source counts were being invoked as early as the 1950s
as one of the key evidential cruxes in the Big Bang versus
Steady State debate (Ryle & Clarke, 1961), a cosmological
contention that was eventually effectively ended by the
discovery of the Cosmic Microwave Background (CMB)
radiation (see e.g. Longair, 2004).

¹ i.e. the number of sources per unit area on the sky with flux
densities in the interval $S \rightarrow S + dS$.

² A Euclidean universe filled with sources of luminosity $L$ with
number density $n$ contains $N = 4\pi n d^3/3$ such sources out to
distance $d$. Since the flux $S = L/4\pi d^2$ it is trivial to show that
$N(S) \propto S^{-3/2}$.
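The Euclidean scaling described above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names and the unit luminosity and number density are assumptions introduced here, not quantities taken from the text.

```python
import math

def euclidean_cumulative_counts(flux_limit, luminosity=1.0, number_density=1.0):
    """Number of identical sources brighter than flux_limit in static Euclidean space."""
    # S = L / (4 pi d^2), so a source exceeds the flux limit out to distance d_max.
    d_max = math.sqrt(luminosity / (4.0 * math.pi * flux_limit))
    # N = (4/3) pi n d_max^3 sources lie within that distance.
    return (4.0 / 3.0) * math.pi * number_density * d_max ** 3

def log_slope(flux_a, flux_b):
    """Logarithmic slope d(log N) / d(log S) between two flux limits."""
    n_a = euclidean_cumulative_counts(flux_a)
    n_b = euclidean_cumulative_counts(flux_b)
    return (math.log10(n_b) - math.log10(n_a)) / (math.log10(flux_b) - math.log10(flux_a))

# Hypothetical flux limits in arbitrary units; the slope does not depend on them.
print(round(log_slope(1.0e-3, 1.0e-1), 3))
```

The slope is independent of the assumed luminosity and number density, which is why departures from it carry information about geometry and evolution rather than about the normalisation.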

Source counts are thus an area of study that is almost 
as old as the science of radio astronomy itself. Today the 
primary interest in source counts (across the whole electro- 
magnetic spectrum) stems from the need to determine the 
contributions that different galaxy populations make to the 
total number of objects in the Universe, in particular the rel- 
ative numbers of star-forming galaxies and those harbour- 
ing active-galactic nuclei (AGN), and how the luminosity 
functions of these populations evolve over cosmic time. Ra- 
dio source counts are essential for foreground subtraction in 
CMB experiments, and are also vital for determining where 
confusion becomes a fundamental limitation in a radio syn- 
thesis image. This may occur either due to classical confu- 
sion imposed by the sources at the faint end of the distri- 
bution that lie within the target field (Condon, 2009), or 
due to the presence of point spread function sidelobes asso- 
ciated with the brighter sources that lie in distant regions 
of the array primary beam (Smirnov et al., in prep.). This 
is particularly relevant at present as we await the arrival of 
the next generation of radio instruments. These have been 
designed to deliver ultra-deep radio imaging and fast sur- 
vey speeds by virtue of their extreme sensitivities and novel 
detector technologies, eventually culminating in the deploy- 
ment of the Square Kilometre Array (SKA; Dewdney et al., 
2009). 

The faint end of the source count distribution is of particular
interest, and there are many publications on the nature
of the sub-mJy source population. The 1.4 GHz counts
exhibit a turn-up at <1 mJy (e.g. Windhorst et al., 1984;
Hopkins et al., 1999) that persists at higher frequencies
(e.g. Fomalont et al., 2002; Heywood et al., 2013). It is
generally accepted that this is due to the
increasing dominance of star-forming galaxies over AGN at
these low luminosities (e.g. Seymour et al., 2008; Padovani et
al., 2009), although radio-weak AGN and FR-I type galaxies
may still make significant contributions (Jarvis &
Rawlings, 2004; Simpson et al., 2006; Gendre & Wall, 2008;
Smolcic et al., 2009). There is however no clear consensus
as to the relative fractions that these objects occupy.

Additional interest in the faintest end of the radio
source counts was recently stimulated by the balloon-borne
Absolute Radiometer for Cosmology, Astrophysics
and Diffuse Emission (ARCADE2; Fixsen et al., 2011)
experiment, which detected a significant excess in the sky
brightness temperature at 3 GHz (Seiffert et al., 2011).
If the result is genuine, these data suggest that there must
be a significant population of hitherto unknown faint radio
emitters responsible for the excess (Vernstrom et al., 2011).
Condon et al. (2012) performed a probability of deflection
(P(D); Scheuer, 1957) analysis of a confusion-limited Very
Large Array (VLA) image at 3 GHz with a depth of 1 μJy.
Their results suggest that if the ARCADE2 result is indeed
produced by a population of discrete radio sources then
they are exceptionally numerous, not associated with known
galaxies and must have 1.4 GHz flux densities of <0.03 μJy.

Clearly there remains much to learn from surveys of
extragalactic radio sources in the μJy regime. Examination
of the differential source counts from multiple surveys
immediately highlights an issue that blights the current data:
interpretation of the measured source counts at flux densities
<1 mJy proves difficult when the derived source counts
from survey to survey do not agree to within their respective
errors. Possible explanations for the scatter include different
calibration accuracies, uncertainties in the method of
correcting for the array primary beam and smearing effects
(e.g. Section 2.4, Fomalont et al., 2006), correction of detection
thresholds due to resolved sources (e.g. Section 3.2,
Bondi et al., 2008), as well as non-instrumental considerations
such as the clustering bias of the sources in the field,
i.e. sample variance.

Avoiding sample variance in faint source counts requires 
a large-area sky survey down to sub-mJy depths, which 
would require multiple, deep pointings on existing radio 
telescopes. Condon (2007) investigated the effect of sam- 
ple variance by measuring the count fluctuations in 17 non- 
overlapping VLA pointings from the Spitzer First Look Sur- 
vey and determined that the scatter due to sample variance 
is (1.07 ± 0.26) times the fluctuations expected in the ab- 
sence of source clustering, concluding that the field-to-field 
variations are likely to be non-cosmic in origin. We take a 
different approach to quantifying the effect of sample vari- 
ance by exploiting an existing extragalactic sky simulation 
in order to present a simple measurement of the scatter in- 
duced in the measured counts. For an in-depth review of the 
subject of radio source counts we refer the reader to de Zotti 
et al. (2010). 



2 METHOD AND RESULTS 

We investigate the effect of sample variance on the scatter in 
the measured source counts by comparing observationally- 
derived measurements with matched samples drawn from an 
existing simulation of the extragalactic radio sky. 

The data points and associated error bars on Figure 1
show the Euclidean-normalized differential source counts
from a variety of radio surveys at 1.4 GHz. The observational
source count data that we use for comparison are drawn from
fourteen individual studies, most of which are conveniently
tabulated by de Zotti et al. (2010). The surveys are partitioned
into three bins according to their solid angle sky coverage:
those that resulted from a single, deep pointing with the
VLA, giving a nominal survey area of approximately
0.196 deg² (hereafter known as the 'deep' bin; Mitchell &
Condon, 1985; Biggs & Ivison, 2006; Fomalont et al., 2006;
Kellermann et al., 2008; Owen & Morrison, 2008; Seymour
et al., 2008; Ibar et al., 2009), surveys covering approximately
4-4.5 deg² (hereafter referred to as the 'broad' bin;
Ciliegi et al., 1999; Gruppioni et al., 1999; Hopkins et al.,
2003), and finally surveys that were in general conducted
over sky areas that exceeded the footprint of the simulation
described below, and thus cannot be compared. These
include the source counts derived from the FIRST survey
(White et al., 1997), as well as those from the targeted surveys
of Bridle et al. (1972). Also plotted on the figure are
the counts from the 2 deg² radio survey of the COSMOS
field (Schinnerer et al., 2004; Bondi et al., 2008).

Figure 1. Left panel: The data points and the corresponding error bars show the observationally-derived Euclidean-normalized
differential source counts at 1.4 GHz from the publications listed in Section 2. The colours correspond to the three bins into which the
observations are divided. The blue points correspond to counts derived from a single VLA pointing, the red points are derived from
surveys covering 4-4.5 square degrees, and the black points are from various other (generally larger area) surveys, displayed here in
order to present the full source count distribution. The blue and red shaded areas show the full range of source counts derived from
independent samplings of the extragalactic sky simulation of Wilman et al. (2008; 2010) with areas matched to the blue and red
observational data points. Right panel: This panel zooms in on the 10 μJy - 1 mJy flux density region. The blue data points are the
same as those on the left hand panel. The black line here shows the mean simulated source counts and the shaded regions that surround
it correspond to 1, 2, 3 and 5 standard deviations as measured from the 1936 source count measurements in each bin. The data shown
on this figure are available from the authors.

Counts from surveys in the deep and broad bins are 
plotted on Figure 1 as the blue and red points respectively. 
The smaller black points correspond to all other surveys. 
Immediately apparent from this selection and colour-coding 
alone is the large spread in source counts derived from the 
deep sample. This is the issue we aim to address with the 
simulation. 

Our next step is to compare these measured values to
matched samples of simulated source counts. For this we
make use of the semi-empirical extragalactic sky simulation
(hereafter referred to as 'the simulation') of Wilman
et al. (2008; 2010). Briefly, the simulation uses observed
and extrapolated luminosity functions to populate an evolving
dark matter skeleton with various galaxy types. The
20 × 20 deg² sky area of the simulation contains ~260 million
sources down to a flux density limit of approximately
10 nJy.

The simulation database can be accessed online via
http://s-cubed.physics.ox.ac.uk

We extract multiple sky patches with areas of 0.196 and
4.5 deg² from the simulation for comparison to the measured
counts in both the deep and broad bins. This process
results in 1936 and 64 unique source catalogues for the deep
and broad samples respectively. For each of these simulated
source subsets we compute the Euclidean-normalized differential
source counts in 58 logarithmically-spaced flux density
bins from 10 nJy to 100 Jy. For each bin the maximum
and minimum value of the counts delineate a region on the
left hand panel of Figure 1 that corresponds to the possible
range of field-to-field fluctuations in the source counts of a
survey of matched area. This is plotted for both the deep
bin (blue area) and the broad bin (red area). We stress that
this process is not blighted by the biases inherent in deriving
accurate counts from observations, such as those mentioned
briefly in Section 1, and the scatter will be induced purely
by the source clustering, itself governed by the underlying
model dark matter density field upon which the simulated
galaxy population is placed. Our chosen bin widths are well
matched to those used in the observations: for every flux
density bin used in the set of observations we calculate the
ratio of that bin width to that of the simulated bin that is
closest to it in terms of central flux, and the mean value of
these ratios over all bins considered is 0.96 with a median
value of 0.83.
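The patch-and-count procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the patches are taken to be non-overlapping squares on a flat sky, the function names are hypothetical, and the catalogue is a small synthetic one with uniformly random positions (so it contains no clustering) rather than the simulation itself.

```python
import numpy as np

def euclidean_normalized_counts(flux_jy, area_sr, bin_edges_jy):
    """S^2.5 dN/dS per flux-density bin, in Jy^1.5 sr^-1."""
    raw, _ = np.histogram(flux_jy, bins=bin_edges_jy)
    widths = np.diff(bin_edges_jy)
    centres = np.sqrt(bin_edges_jy[:-1] * bin_edges_jy[1:])  # geometric bin centres
    return centres ** 2.5 * raw / (widths * area_sr)

def patch_counts(x_deg, y_deg, flux_jy, field_deg, patch_deg, bin_edges_jy):
    """Counts for every non-overlapping square patch that tiles a square field."""
    n_side = int(field_deg // patch_deg)
    area_sr = np.radians(patch_deg) ** 2  # flat-sky approximation (assumption)
    ix = np.floor(x_deg / patch_deg).astype(int)
    iy = np.floor(y_deg / patch_deg).astype(int)
    counts = np.empty((n_side * n_side, len(bin_edges_jy) - 1))
    for k in range(n_side * n_side):
        in_patch = (ix == k // n_side) & (iy == k % n_side)
        counts[k] = euclidean_normalized_counts(flux_jy[in_patch], area_sr, bin_edges_jy)
    return counts

# Synthetic stand-in catalogue: 2 x 2 deg field, power-law fluxes above 10 uJy.
rng = np.random.default_rng(0)
n_src = 20000
x = rng.uniform(0.0, 2.0, n_src)
y = rng.uniform(0.0, 2.0, n_src)
flux = 1e-5 * (1.0 - rng.uniform(size=n_src)) ** (-1.0 / 1.5)

edges = np.logspace(-8, 2, 59)  # 58 log-spaced bins from 10 nJy to 100 Jy
counts = patch_counts(x, y, flux, field_deg=2.0, patch_deg=0.5, bin_edges_jy=edges)

envelope_low, envelope_high = counts.min(axis=0), counts.max(axis=0)  # shaded range
mean_counts, std_counts = counts.mean(axis=0), counts.std(axis=0)     # mean and scatter
```

The per-bin minimum and maximum give the envelope plotted in the left hand panel, while the per-bin mean and standard deviation correspond to the quantities shown in the right hand panel.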

The right hand panel of Figure 1 shows a zoomed-in region
covering the 10 μJy to 1 mJy flux density range. Again
the blue points show the observed source counts for single
pointing experiments. The mean value of the simulated
counts from the 1936 independent distributions in each bin
is shown by the black line. The shaded regions surrounding
this correspond to 1, 2, 3 and 5 times the standard deviation
of the count measurements.

Figure 1 clearly shows that the scatter induced in the
source counts by the clustering of radio sources across the
sky for a survey of fixed area is strongly dependent
on the depth of the survey, due to the unmodified surface
density of radio sources rising with decreasing flux density.
Observational challenges notwithstanding, larger areas are
required to accurately quantify the counts of faint radio
sources. Count fluctuations induced by sample variance are
significant enough to dominate the observed scatter at flux
densities above ~100 μJy, and contribute significantly below
this. Notable outliers on Figure 1 are the anomalously high
and rising count values from Owen & Morrison (2008). The
P(D) analysis of Condon et al. (2012) was conducted over
the same field as the Owen & Morrison (2008) observations,
partially motivated by the prospect of confirming the high
counts previously seen in that region. Condon et al. (2012)
determine new counts with their 8" resolution VLA C-array
observations that are a factor of ~4 lower than the existing
ones derived from the multi-configuration, 1.6" resolution
observations of Owen & Morrison (2008), and speculate that
overestimation of the resolution corrections is responsible
for the discrepancy.

Table 1. Survey depths (detection thresholds, not rms sensitivities) required to restrict the scatter in the source counts imposed by
sample variance to values of 1%, 5%, 10% and 25% of the mean value of the source counts in that flux density bin. These are presented
as a function of survey area, and all values are in μJy. The values are derived from polynomial fits in log space to the measured ratios
of the standard deviation to the mean. Polynomial coefficients are also provided, see Figure 2 and the text for details.

Area (deg²)   S_limit (1%)   S_limit (5%)   S_limit (10%)   S_limit (25%)   p1         p2         p3         p4
0.1           ...            2.5            17.96           155.1           -0.00842   -0.07982    0.20276    0.85972
0.3           ...            10.31          61.56           500.0           -0.00809   -0.07383    0.21948    0.63105
0.5           ...            17.96          107.2           870.6           -0.00872   -0.08273    0.17371    0.4505
1.1           ...            45.24          253.9           2193            -0.00962   -0.09393    0.12197    0.20776
1.5           0.102          69.63          367.5           3174            -0.01073   -0.10866    0.0578     0.05038
2.1           0.348          94.75          531.8           5197            -0.0095    -0.09286    0.10834    0.0151
3.1           1.055          155.1          818.5           7521            -0.0102    -0.09855    0.10431   -0.03349
4.1           3.008          211.1          1113            13930           -0.01431   -0.15594   -0.13708   -0.41124
4.9           3.848          270.1          1425            16750           -0.01321   -0.13809   -0.05466   -0.34036

Figure 2. The solid lines show standard deviations ($\sigma$) per flux
density bin for a range of theoretical (colour coded) survey areas,
expressed as a fraction of the mean source counts ($\mu$) in that bin.
This plot essentially shows the detection threshold that a survey
needs to reach to limit the uncertainty induced in the counts by
sample variance to a specific level. Polynomials are fitted to the
base-10 logarithms of the distributions, as shown by the dashed
lines. Details are given in the text and coefficients are provided in
Table 1 along with depth requirements for 1, 5, 10 and 25% values
of the count uncertainty (delineated by the horizontal lines). The
vertical lines show the detection thresholds that must be reached
in order to deliver 5% uncertainty for each area.



There is a deviation of the measured broad area counts
from the corresponding simulated samples in the left hand
panel of Figure 1 below approximately 150 μJy. At this
depth the broad area counts are drawn solely from the
Phoenix Deep Survey (Hopkins et al., 2003). This survey
includes a deeper tier that has an effective area that is notably
less (~1-1.5 deg²) than the 4.5 deg² probed by the multiple
samples of the simulation, and it is from this smaller, deeper
region that these counts originate. The deviation illustrates
that even on scales of ~1 deg² the sampling variation in the
counts is not negligible.

As noted by Wilman et al. (Section 4, 2008), predicting
the behaviour of the radio sky at levels that
are beyond present observation requires extrapolation of the
known luminosity functions. We naturally cannot rule out
departures of the simulation from reality below the limits 
of the observationally measured source counts. Our results 
are also sensitive to the accuracy of the clustering model 
in the simulation. Wilman et al. (2008) test the validity of 
the source clustering by comparing the simulated and mea- 
sured angular two point correlation functions, and find good 
agreement. For further details, including potential (less sig- 
nificant) limitations of the simulation we refer the reader to 
Wilman et al. (2008). 

Note also that the brightest end of the source counts
has uncertainties in the measurements comparable to
those associated with the faintest counts. The effect that
causes the large scatter is analogous at both ends of the 
scale: in the case of the bright sources it is a combination 
of small effective survey volumes for nearby sources and the 
intrinsic rarity of extremely bright sources at large distances, 
resulting in low number counts in both scenarios. 

The following two subsections broaden the utility of the 
above results by presenting a pair of tools for observers who 
wish to carry out deep radio surveys in order to investigate 
the faint radio source population. 



2.1 Optimisation of survey area according to flux 
density detection threshold 

Here we present a method for approximately evaluating the
area that a survey of a given detection threshold must cover
in order to limit the uncertainty in the counts induced by
sample variance to a certain level. The standard deviation
derived from the multiple count samples per flux density bin
($\sigma$) is expressed as a fraction of the mean count value ($\mu$)
in that bin, and these data are plotted in log space as solid
lines on Figure 2. These calculations are performed for a
representative group of nine survey areas ranging from 0.1
to 4.9 deg², as listed on the legend of Figure 2. Testing sky
areas larger than this becomes problematic as the number
of independent catalogues that can be extracted from the
simulation decreases with sky area. This is reflected in the
increasing ripple levels of the curves on Figure 2 as the sky
area increases.

A good approximation to the measured curves is provided
by a least-squares fitted polynomial of the form

$$\log(\sigma/\mu) = p_1 + p_2\log(S) + p_3\log(S)^2 + p_4\log(S)^3. \qquad (1)$$

The fitted curves are shown by the dashed lines on Figure 2.
The coefficients $p_n$ are provided in Table 1 for the nine
survey areas, allowing the approximate uncertainties to be
calculated for arbitrary surveys. As this is a polynomial fit
it should not be used to extrapolate outside the range of the
data to which it was fitted; however, the lower limit of 10 nJy
is the formal flux-density limit of the simulation, and the
source counts are generally well constrained observationally
beyond the 10 mJy upper limit and up to the rare >1 Jy
population.
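As a hedged illustration of how the fit might be evaluated, the sketch below computes $\sigma/\mu$ from the Table 1 coefficients for three of the nine areas. One caveat is inferred here from the tabulated numbers rather than stated in the text: the coefficients appear to reproduce the tabulated depth limits only when p1 multiplies the cubic term and p4 is the constant (descending powers), with $S$ in Jy and base-10 logarithms. The function and dictionary names are hypothetical.

```python
import numpy as np

# (p1, p2, p3, p4) copied from Table 1, keyed by survey area in deg^2.
TABLE1_COEFFS = {
    0.1: (-0.00842, -0.07982, 0.20276, 0.85972),
    1.5: (-0.01073, -0.10866, 0.0578, 0.05038),
    4.9: (-0.01321, -0.13809, -0.05466, -0.34036),
}

def fractional_scatter(flux_jy, area_deg2):
    """sigma/mu from the cubic fit.

    ASSUMPTION: coefficients are applied in descending powers of log10(S / Jy),
    which is the ordering that matches the depth limits listed in Table 1.
    """
    coeffs = TABLE1_COEFFS[area_deg2]
    # np.polyval expects the highest-order coefficient first.
    return 10.0 ** np.polyval(coeffs, np.log10(flux_jy))

# Tabulated 5% and 25% limits for the 0.1 deg^2 case are 2.5 and 155.1 uJy.
print(fractional_scatter(2.5e-6, 0.1))
print(fractional_scatter(155.1e-6, 0.1))
```

The two printed values come out close to 0.05 and 0.25 respectively, which is the consistency check that motivates the assumed ordering.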

Table 1 also lists the survey limits required to reduce
the scatter in the source counts to 1, 5, 10 and 25% of the
mean values (shown by the horizontal lines on Figure 2) for
the nine hypothetical surveys. To illustrate how these limits
are determined the 5% case is presented as an example by
the colour coded vertical lines on Figure 2. Note that the
four smallest sky areas do not provide the accuracy to ever
reach a 1% uncertainty within the limits of the simulation,
hence the missing values in Table 1.
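The depth limits can be recovered by inverting the fitted cubic, as in the following minimal sketch. It assumes (as an inference from the tabulated values, not a statement in the text) that the Table 1 coefficients are ordered from the cubic term down to the constant with $S$ in Jy, and it restricts solutions to the 10 nJy to 10 mJy range quoted for the fit. The function name and the choice of returning the brightest valid root are assumptions.

```python
import numpy as np

def depth_for_scatter(coeffs, target_fraction, s_min_jy=1e-8, s_max_jy=1e-2):
    """Flux density (Jy) at which the fitted sigma/mu equals target_fraction.

    ASSUMPTION: coeffs = (p1, p2, p3, p4) in descending powers of log10(S / Jy).
    Returns None when the cubic has no real root inside the fitted range.
    """
    p1, p2, p3, p4 = coeffs
    # Solve p1 x^3 + p2 x^2 + p3 x + p4 - log10(target) = 0 for x = log10(S).
    roots = np.roots([p1, p2, p3, p4 - np.log10(target_fraction)])
    real = roots[np.abs(roots.imag) < 1e-9].real
    valid = real[(real >= np.log10(s_min_jy)) & (real <= np.log10(s_max_jy))]
    if valid.size == 0:
        return None
    return 10.0 ** valid.max()

row_0p1_deg2 = (-0.00842, -0.07982, 0.20276, 0.85972)  # Table 1, 0.1 deg^2
limit_5pc = depth_for_scatter(row_0p1_deg2, 0.05)
limit_1pc = depth_for_scatter(row_0p1_deg2, 0.01)
print(limit_5pc, limit_1pc)
```

For the 0.1 deg² row this returns roughly 2.5 μJy for the 5% case and no solution for the 1% case, in line with the tabulated entry and the missing value.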



2.2 Corrections for Poisson uncertainties in order 
to include the effects of source clustering 

The sample variance is equivalent to the variance of the 
counts in the cells into which the simulation is divided, and 
consists of two components, namely the Poisson variance 
and a second contribution caused purely by the cosmologi- 
cal clustering of the sources. In this section we provide an 
estimate of the contribution to the sample variance that is 
solely due to source clustering as a function of flux density 
and survey solid angle. This allows existing and future ex- 
periments that measure the counts of faint radio sources to 
correct their Poisson errors in order to include clustering 
effects. 

Figure 3. Values of $\sigma_{CL}^{\%}$ for seven survey solid angles in the range
0.003 to 3.0 deg². The values of $\sigma_{CL}^{\%}$ are for use in Equation 4
in order to apply a correction to observationally-derived Poisson
errors to include the cosmological clustering of sources.
The sky areas covered by this plot should ensure that it remains
useful for single-pointing observations with future radio telescopes
such as MeerKAT and the dish component of the SKA. The faint
dotted curves show the mean fractional percentage Poisson errors
for comparison to existing theory at the end of Section 2.2.

The 1$\sigma$ percentage errors due to both Poisson scatter
($\sigma_P^{\%}$) and sample variance ($\sigma_S^{\%}$) can be calculated for the
simulated Euclidean-normalized differential source counts
for each flux density bin. An estimate of pure Poisson errors
that does not include the effects of source clustering is
derived by randomising the position of each source in the
simulation and measuring the variance of the counts in each
cell. This procedure is carried out 100 times and the mean
variance is used to calculate the 1$\sigma$ Poisson percentage error
$\sigma_P^{\%}$. The sample variance uncertainty is taken as the standard
deviation of the individual count values in each cell of
the unperturbed simulation, as per the 1$\sigma$ limits presented
in Section 2.1. These calculations are performed in flux density
bins with a logarithmic width of 0.2 dex over the full
flux-density range of the simulation.
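The randomisation step can be sketched for a single flux-density bin as follows. This is a toy illustration, not the authors' code: the clustered catalogue is an invented set of Gaussian clumps, the cells are squares on a flat sky, and all function names are hypothetical.

```python
import numpy as np

def cell_counts(x, y, field, n_side):
    """Number of sources in each cell of an n_side x n_side grid over a square field."""
    edges = np.linspace(0.0, field, n_side + 1)
    hist, _, _ = np.histogram2d(x, y, bins=[edges, edges])
    return hist.ravel()

def percentage_scatter(counts):
    """1-sigma scatter of the cell counts as a percentage of their mean."""
    return 100.0 * counts.std() / counts.mean()

def poisson_percentage_error(n_sources, field, n_side, n_trials=100, seed=0):
    """Percentage error from the mean count variance over randomised-position trials."""
    rng = np.random.default_rng(seed)
    variances = []
    for _ in range(n_trials):
        x = rng.uniform(0.0, field, n_sources)
        y = rng.uniform(0.0, field, n_sources)
        variances.append(cell_counts(x, y, field, n_side).var())
    mean_count = n_sources / n_side ** 2
    return 100.0 * np.sqrt(np.mean(variances)) / mean_count

# Toy clustered catalogue (assumption): 200 clumps of 100 sources in a 10 x 10 deg field.
rng = np.random.default_rng(42)
field, n_side = 10.0, 10
centres = rng.uniform(0.0, field, size=(200, 2))
members = np.repeat(centres, 100, axis=0) + rng.normal(0.0, 0.1, size=(20000, 2))
x, y = np.mod(members, field).T  # wrap sources that scatter past the field edge

sigma_s = percentage_scatter(cell_counts(x, y, field, n_side))  # sample variance term
sigma_p = poisson_percentage_error(x.size, field, n_side)       # Poisson-only term
sigma_cl = np.sqrt(max(sigma_s ** 2 - sigma_p ** 2, 0.0))       # clustering-only term
print(sigma_s, sigma_p, sigma_cl)
```

Randomising the positions preserves the number of sources while destroying their spatial correlations, so the difference between the two scatters isolates the clustering contribution.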

How can the contribution to the sample variance that
is purely due to cosmological source clustering be distilled?
We assume that the source clustering multiplies the number
of galaxies in each independent cell by a factor $f$ that has a
mean value of 1. The rms percentage scatter in this factor
is denoted by $\sigma_{CL}^{\%}$, and is independent of the raw source
counts in any given bin (and thus independent of the Poisson
errors). Furthermore, the factor $f$ is assumed to be a
function of flux density that varies slowly enough such that
$f$ can be treated as constant across each flux density bin in
which sources are counted.

If the distribution of radio sources were devoid of any
clustering then the Poisson variance ($\sigma_P^{\%}$) would be the
sole cause of the scatter in the Euclidean-normalized counts
($N_{bin}$) in any given flux-density bin. We assume that the
source clustering contributes to the measured variance ($\sigma_S^{\%}$)
from the simulation in a way that conforms to the behaviour
of the $f$ parameter described above, i.e. the clustering adjusts
the measured counts to a value of $f \times N_{bin}$. The sample
variance (i.e. the variance of the counts in each cell of the
simulation, $\sigma_S^{\%}$) is thus the quadratic sum of the Poisson
variance ($\sigma_P^{\%}$) and the additional variance due to cosmological
clustering ($\sigma_{CL}^{\%}$). It does not drop to zero even in the
absence of any source clustering. We can extract the rms
percentage scatter in $f$ using error propagation rules:

$$\sigma_{CL}^{\%} = \sqrt{(\sigma_S^{\%})^2 - (\sigma_P^{\%})^2} \qquad (2)$$

since in the absence of clustering the Poisson variance is
the sole contributor to the sample variance. The parameter
$\sigma_{CL}^{\%}$ is independent of the choice of bin width, and its values



6 Heywood, Jarvis & Condon 



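derived from our simulation can be used in conjunction with
an observationally-derived value of $\sigma_P^{\%\,obs}$ to determine

$$\sigma_S^{\%\,obs} = \sqrt{(\sigma_{CL}^{\%})^2 + (\sigma_P^{\%\,obs})^2} \qquad (3)$$

$$\phantom{\sigma_S^{\%\,obs}} = \sqrt{(\sigma_{CL}^{\%})^2 + \frac{100^2}{N_{obs}}} \qquad (4)$$

where $N_{obs}$ is the number of sources in that flux density bin.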

Figure 3 shows the values of $\sigma_{CL}^{\%}$ derived from the
simulation that are applicable to faint flux density bins
(10.0 nJy < $S_{centre}$ < 0.3 mJy) for a range of effective survey
solid angles. For a given measurement of the Euclidean-normalized
differential source counts, Figure 3 can be used in
conjunction with Equation 4 in order to correct the percentage
error estimate ($\sigma_S^{\%\,obs}$) in the observed counts ($N_{obs}$) to
include clustering effects. We impose the condition that for
the derived value of $\sigma_{CL}^{\%}$ to be trustworthy, it must exceed
$5\sigma_P^{\%}$. This is to account for the fact that the Poisson errors
derived from flux density bins containing average counts of
<1 cannot be reliably used. These conditions lead to the cut-offs
in the lines on Figure 3. The cut-offs manifest themselves
at fainter flux densities with smaller survey solid angles as
the raw source counts per bin decrease with sky coverage.

The seven sky survey areas in Figure 3 cover the range
0.003 to 3.0 deg². The smallest areas are chosen to make the
figure relevant for the current deepest observations, where
the faintest sources are detected in effective areas much
smaller than the primary beam size. The broader areas make
the plot relevant for future radio continuum surveys with
MeerKAT (13.5 m dishes) and the SKA (15 m dishes)*.

We can compare our predictions for the effects of source
clustering to the measurement of Condon (2007). Seventeen
independent pointings of approximately 0.2 deg² each were
extracted from the Spitzer First Look Survey, and with
approximately 100 sources per field with a flux density limit
of 150 μJy, our simulation predicts a $\sigma_{CL}^{\%}$ value of approximately
12.5%, as shown by the intersecting dashed lines on
Figure 3. Applying these values to Equation 4 results in a
percentage error in the observed counts of $\sigma_S^{\%\,obs}$ = 16%.
This is slightly higher than but still broadly consistent with
the observed value of (10.7 ± 2.6)%.
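The arithmetic behind this comparison can be reproduced in a few lines. The sketch assumes that Equation 4 adds the clustering term in quadrature to a Poisson percentage error of $100/\sqrt{N_{obs}}$, which is the form that returns the 16% quoted above; the function names are hypothetical.

```python
import math

def corrected_percentage_error(sigma_cl_percent, n_obs):
    """Percentage error on the counts including clustering (assumed form of Equation 4)."""
    poisson_percent = 100.0 / math.sqrt(n_obs)
    return math.sqrt(sigma_cl_percent ** 2 + poisson_percent ** 2)

def clustering_percentage(sigma_s_percent, sigma_p_percent):
    """Inverse step: clustering-only scatter from the total and Poisson scatters."""
    return math.sqrt(sigma_s_percent ** 2 - sigma_p_percent ** 2)

# Values quoted in the text: sigma_CL of about 12.5% and about 100 sources per field.
total = corrected_percentage_error(12.5, 100)
print(round(total, 1))  # 16.0
```

With 100 sources the Poisson term alone is 10%, so the clustering term raises the expected scatter by a factor of about 1.6 in this example.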

The shapes of the $\sigma_{CL}^{\%}$ curves on Figure 3 are worthy
of comment as they say something about the clustering
strength of radio sources as a function of their flux densities.
The plot shows the area-dependent trend that one would
instinctively expect. The effect of source clustering rises with
flux density although this is not a smooth change over the
plotted range. This is likely due to the brighter end of the
source counts being dominated by more massive elliptical
galaxies that are more strongly clustered than the faint
sources, the less clustered star-forming spiral galaxies.

* The Australian SKA Pathfinder (ASKAP) is a special case as
it has been designed to deliver an instantaneous field of view at
1.4 GHz of ~30 deg². The sample variance contribution due to
the clustering of cosmological sources is not likely to be an issue
for the surveys that are planned for it.

Finally, we compare the trend that these lines exhibit to
existing theory. Clustering will increase the variance of the
source counts in each individual cell. If each cell contains
sources in a solid angle $\Omega$ and a (fairly narrow) flux-density
range $\Delta S$, then the mean number of sources per cell is

$$\bar{N} = n(S)\,\Delta S\,\Omega. \qquad (5)$$



The sample variance can be written as the sum of the Poisson
variance $\bar{N}$ and the variance caused solely by clustering.
Peebles (1980) expresses this in terms of $w(\theta)$, the two-point
correlation as a function of angular separation $\theta$:

$$\langle (N - \bar{N})^2 \rangle = \bar{N} + \frac{\bar{N}^2}{\Omega^2} \int\!\!\int w(\theta)\, d\Omega_1\, d\Omega_2. \qquad (6)$$



The function $w(\theta)$ is usually approximated by a power-law
of the form

$$w(\theta) \approx A\,\theta^{-\alpha}. \qquad (7)$$

Blake & Wall (2002a,b) measured $w(\theta)$ in the range 0.1 <
$\theta$ (deg) < 10 for NRAO VLA Sky Survey (NVSS; Condon et
al., 1998) sources stronger than about 10 mJy at 1.4 GHz
and found $A \approx 1.0 \times 10^{-3}$, $\alpha \approx 0.8$. Blake et al. (2004)
combined Sydney University Molonglo Sky Survey (SUMSS;
Bock, Large & Sadler, 1999), NVSS, and Westerbork Northern
Sky Survey (WENSS; Rengelink et al., 1997) data to estimate
a slightly larger $A \approx 1.6 \times 10^{-3}$ and a slightly steeper
$\alpha \approx 1.1$.

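Following de Zotti et al. (2010), we note that the fractional
variance

$$\frac{\langle (N - \bar{N})^2 \rangle}{\bar{N}^2} = \frac{1}{\bar{N}} + \frac{1}{\Omega^2} \int\!\!\int w(\theta)\, d\Omega_1\, d\Omega_2 \qquad (8)$$

has the advantage that the clustering term does not explicitly
depend on $\bar{N}$ or $\Delta S$. Using our notation

$$\frac{\langle (N - \bar{N})^2 \rangle}{\bar{N}^2} = \frac{1}{\bar{N}} + \sigma_{CL}^2, \qquad (9)$$

where

$$\sigma_{CL}^2 \equiv \frac{1}{\Omega^2} \int\!\!\int w(\theta)\, d\Omega_1\, d\Omega_2 \approx 2.36 A \left(\frac{\Omega}{\mathrm{deg}^2}\right)^{-1/2} \qquad (10)$$

is the fractional variance contributed by clustering alone.
Thus

$$\sigma_{CL}^{\%} \propto \left(\frac{\Omega}{\mathrm{deg}^2}\right)^{-1/4} \qquad (11)$$

declines more slowly with $\Omega$ than the Poisson scatter, which
is proportional to $\Omega^{-1/2}$, as is reflected in our results in
Figure 3 in which the fainter dotted lines show the mean
fractional percentage Poisson errors.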
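The area scaling of the clustering term can be illustrated with a simple Monte Carlo estimate of the double integral of a power-law $w(\theta)$ over a cell. The sketch below assumes square cells on a flat sky and a slope of exactly 1; the function name, pair count and seed are arbitrary choices. The amplitude it returns is a noisy estimate and is not intended to reproduce the numerical coefficient quoted in the text; only the scaling with solid angle is of interest.

```python
import numpy as np

def clustering_variance(amplitude, alpha, omega_deg2, n_pairs=200_000, seed=1):
    """Monte Carlo estimate of (1/Omega^2) * double integral of A * theta^-alpha.

    ASSUMPTIONS: square cell of area omega_deg2, flat sky, theta in degrees.
    """
    rng = np.random.default_rng(seed)
    side = np.sqrt(omega_deg2)
    p1 = side * rng.random((n_pairs, 2))  # first point of each random pair
    p2 = side * rng.random((n_pairs, 2))  # second point of each random pair
    theta = np.linalg.norm(p1 - p2, axis=1)
    # The mean over uniformly drawn pairs equals the integral divided by Omega^2.
    return amplitude * np.mean(theta ** (-alpha))

var_1deg2 = clustering_variance(1.6e-3, 1.0, 1.0)
var_4deg2 = clustering_variance(1.6e-3, 1.0, 4.0)

# Ratio of the percentage scatters for a factor of 4 in area.
scatter_ratio = np.sqrt(var_4deg2 / var_1deg2)
print(scatter_ratio, 4.0 ** -0.25)
```

Because the same random pairs are reused for both areas, the ratio isolates the geometric scaling: quadrupling the area reduces the clustering scatter by a factor of $4^{-1/4}$, compared with the factor of 2 expected for the Poisson term.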



3 CONCLUDING REMARKS 

Observationally-derived values of the counts of faint radio 
sources exhibit levels of scatter that can be up to a factor of 
several greater than the quoted uncertainties in the counts. 
We have provided an estimate of the scatter induced in the 
counts of faint radio sources due to the sample variance in- 
duced by cosmological source clustering by using many in- 
dependent samples of an extragalactic sky simulation, and 
comparing these results to matched observations. The deep- 
est observations to date have been carried out using single 
deep pointings with the VLA. The fluctuations induced by 
sample variance in the counts derived from such an observation
may be large enough to completely explain the observed
scatter at flux densities above approximately 100 μJy, and
we have quantified their contribution as a function of survey
area below this level.

We have presented a method for estimating the count 
uncertainty induced by sample variance for an arbitrary ra- 
dio survey, or reciprocally, for determining the depth that 
a radio survey of fixed solid angle coverage must reach in 
order to limit the count uncertainty. We have also derived 
a method for correcting Poisson errors in order to include 
the effects of source clustering. This method is applicable 
to the deepest surveys that exist today and should remain 
applicable for future deep continuum surveys with the VLA, 
MeerKAT and the SKA, down to survey flux density limits 
of 0.1 nJy. We stress again the distinction between survey 
flux density limits and the rms sensitivity of the correspond- 
ing radio images when applying these methods. 

The amount that cosmological clustering affects the
counts is, as one would expect, strongly dependent on survey
area but also on flux density limits, likely due to the preferential
clustering of massive elliptical galaxies at the brighter
end, with the less clustered star-forming spiral galaxies
dominating the fainter counts.

The method for correcting Poisson uncertainties is 
broadly consistent with the observationally-derived mea- 
surement of the count fluctuations presented by Condon 
(2007), who concluded that human-induced instrumental 
calibration and interpretative differences are likely to domi- 
nate the scatter. Such effects are certainly contributing fac- 
tors to the difference in published counts in cells between 
different authors; the potential overestimation of the reso- 
lution correction resulting in the very high counts of Owen 
& Morrison (2008) being a prime example that is not ex- 
plained by our results. The sample variance in the case of 
the deepest surveys such as this is only marginally larger 
than the actual Poisson variance due to the source counts 
per bin being very low, counted over effective areas much 
smaller than the primary beam size. 

Current facilities are not suited to deriving a low-uncertainty
measurement of the faint radio source counts
without an unfeasibly large investment of telescope time.
It is likely that the issue will lack an empirical resolution
until the completion of the next generation of legacy radio
surveys with future instruments such as ASKAP, MeerKAT
and eventually the SKA.